ABSTRACT


Understanding the affect expressed by learners is essential 
for enriching the learning experience in Massive Open On- 
line Courses (MOOCs). However, online learning environ- 
ments, especially MOOCs, pose several challenges in un- 
derstanding the different types of affect experienced by a 
learner. In this paper, we define two categories of emotions, 
explicit emotions as those collected directly from the stu- 
dent through self-reported surveys, and implicit emotions as 
those inferred unobtrusively during the learning process. We 
also introduce positivity as a measure to study the valence 
reported by students chronologically, and use it to derive in- 
sights into their emotion patterns and their association with 
learning outcomes. We show that implicit and explicit emo- 
tions expressed by students within the context of a MOOC 
are independent of each other; however, they correlate better
in terms of students’ behavior than in terms of their valence.
1. INTRODUCTION

The exploration of emotions expressed by students in Mas- 
sive Open Online Courses (MOOCs) has caught the atten- 
tion of researchers for improving the remote and non-contact 
learning experience [28, 8, 16, 5]. A few examples of these 
studies infer emotions of students from their behavior [16], 
surveys collected during the course [8, 1], clickstream data 
and discussion forums [28, 5]. The relationship between stu- 
dents’ emotions and their behavior, learning outcomes, en- 
gagement, and dropout within the MOOC context is estab- 
lished in [1, 25, 21]. 


Emotions experienced by students during a course impact 
their behavior and learning outcomes [15, 19]. Detecting the 
emotion experienced during learning is difficult, and various 
methods have been employed for this purpose. The meth- 
ods used to sample emotions mainly fall into three categories 
as outlined by [27]. The first category consists of methods
that take snapshots of students’ emotions during the course 
through survey questionnaires. These methods are intrusive 
to the learning process and are usually self-reported and 
subjective in nature. The second category detects emotions 
during the learning process and includes methods that sam- 
ple emotions non-intrusively like facial expression detection, 
conversations, gaze detection, and analysis of text data gen- 
erated by student interactions within the course [9, 10]. The 
third category measures emotions after the learning process. 
The first two categories are relevant to our paper. In [27], 
the methods in the second category are assumed to coun- 
teract the limitations of the methods in the first category. 
Therefore, in our study we use two categories of emotions 
to get a more complete view of students’ emotional states. 
In this paper, we measure explicit emotions as the emotions 
recorded from students’ self-reported surveys and Self-Assessment
Manikins (SAMs), and implicit emotions as those inferred
from the open discussion forum posts of students.


Emotions measured in association with learning seem to be 
short-lived and last for a few seconds to minutes [15]. Since 
the emotions were expressed by students in this MOOC at 
different, non-uniform points in time, one of the challenges 
of analyzing such a series is the spontaneity of emotions. 
As the emotions are surveyed after the end of a video or 
module, we only get a snapshot of the students’ emotions 
during the course [27]. Between two consecutive surveys, a 
student’s emotions can not only change multiple times, but 
also be conflicting, as students can experience multiple emo- 
tions simultaneously [1], which could hinder a chronological 
analysis of the emotions. However, even if students’ emo- 
tions are spontaneous and likely to be fraught with missing 
data, there might be a trend to their emotions over time. 
An approach that leverages this idea has been proposed in 
[7], where the positive affect experienced by an individual 
is averaged over a period of time while the negative reports 
are ignored. Inspired by this technique, we also calculate the 
“positivity” of students at each point of the reported emo- 
tions and derive a positivity sequence instead of an emotion 
sequence. This positivity sequence is expected to be more 
stable over time as compared to the emotion sequence. 


We study the implicit and explicit emotions expressed by the 
MOOC students through the following research questions. 


RQ1: Are the explicit and implicit emotions expressed within
a MOOC context similar? Can one be used as a proxy for
the other, or are both of them equally important for
characterizing a student’s emotional state?

RQ2: What do the combined (explicit plus implicit) emo- 
tional states and positivity sequences characterize about a 
student’s learning? 


To the best of our knowledge, this is the first attempt at 
investigating the effect of explicit and implicit emotion cat- 
egories within a MOOC context. We find that implicit and 
explicit emotions expressed by students are indeed different 
and both are necessary to characterize student emotions. We 
also see that combined positivity values correlate relatively 
well with behavior compared to their valence values. 


2. RELATED WORK 


The comparison of self-reported metrics like emotions and 
performance in self-regulated learning and other educational 
contexts has been studied and generally found to be incon- 
sistent with the measured reports [14, 26, 29]. While many 
of these studies measure the alignment of students’ achieve- 
ment calibration with their actual performance [13, 26, 14], 
we aim to compare the self-reported emotions of students in 
MOOCs against the emotions we measure from their behavior
in the MOOC, in the form of interactions on the discussion
forum. A direct comparison of these methods with ours
is infeasible because of the difference in instrumentation and 
methodology. However, we will compare our general obser- 
vations with the trends in literature. 


We use students’ self-reports of emotions along with the
Self-Assessment Manikin (SAM) as the explicit measures of
students’ emotions. Self-reports are a very common way of
measuring students’ emotions because of their subjective
nature [11]. Collecting students’ emotions through surveys is
easy to deploy on a large scale and is low cost [11], which
makes surveys favourable for use in MOOCs [1]. SAM is a
non-verbal assessment technique that allows people to rate 
their pleasure, represented as valence in our case, on an or- 
dinal scale [4]. SAMs have been used to measure emotion in 
online learning environments [6, 8]. 


Among the techniques available for detecting the implicitly 
expressed emotions of students, analyzing emotions from 
texts is one of the least invasive ways of detecting students’ 
emotions [17, 22]. Using discussion forums to detect stu- 
dents’ emotions in MOOCs is becoming prominent due to 
its unobtrusiveness and low instrumentation [28]. Many 
sentiment analysis techniques for detecting valence from text 
including the word-affect lexicon used in this paper are listed 
in [18], and education has been noted as one of the applica- 
tions of sentiment analysis. We use Warriner’s [24] word- 
affect lexicon to calculate the valence values of words in 
the discussion forum records. The effectiveness of War- 
riner’s word-affect lexicon [24] for sentiment analysis has 
been demonstrated for detecting sarcasm [20], finding geo- 
graphical locations associated with happier tweets [12], etc. 
This automatic method to detect affect from discussion fo- 
rum data enables a scalable way to glean implicit affect in 
MOOCs from a large number of forum posts. Sentiment 
analysis polarity techniques were applied on discussion forum
posts in [25]. In [28], Mechanical Turk is used to obtain
confusion ratings, and simple features like the number of
question marks are used to predict the level of confusion in
the discussion forum posts. They also use Linguistic Inquiry
and Word Count (LIWC) to consider negation words and phrases
as an indicator of potential confusion, and clickstream
patterns (e.g., quiz-quiz-forum) as a feature for detecting
confusion. Previous research on using the discussion forum to
estimate student retention and performance is complicated by
a vast amount of missing and imbalanced data [3]. We also face
challenges in detecting implicit emotions in the midst of
context-specific terms.


Table 1: (1) Number of students vs. SAM surveys;
(2) Number of students vs. SAM scores


3. DATA DESCRIPTION 


3.1 Course Description 

We use the data from the introductory course on Statistics 
called “I Heart Stats” for our study. This was a self-paced 
MOOC on the EdX platform, and the entire course content 
was released at the start of the course. The course had nine 
modules, with the ninth module being for the assessment of 
the overall course. During the course, students were asked 
to self-report their emotions and valence through emotion 
surveys and SAM surveys respectively. Initially, 24,279 students
were enrolled in the course; however, fewer than
15,000 students had activity in the first two weeks. Finally,
only 1,941 students completed it. Of all the students, 1,629
responded to at least one emotion or SAM survey, and par- 
ticipated in the discussion forum as well. Only these stu- 
dents have been included in the Analysis section of the pa- 
per as these are the only students generating both implicit 
as well as explicit emotions. Note that students completing 
the course are likely to have longer sequence lengths. Students
who are part of the course but do not interact with the
discussion forum cannot be included in the analysis, leading
to an overrepresentation of active users.


3.2 Explicit Emotions 

Emotion Surveys: Of all the students, 6,100 submitted 
21,448 emotion surveys. During the course, 12 emotion sur- 
veys were conducted in which students self-reported their 
current emotional state. This was optional, and students
could choose one or more from a list of 15 emotions: anger,
anxiety, boredom, confusion, contentment, disappointment,
enjoyment, frustration, hope, hopelessness, isolation, pride,
relief, sadness, and shame. Further details can be found in [1].
The valence values of these emotions were calculated using 
Warriner’s lexicon [24], with a scale of 1 to 9 and 5 being 
neutral. We shift the scale to [-4, 4] to bring the neutral va- 
lence to 0. In the case of multiple emotions being expressed, 
the associated valence values were averaged to obtain one 
valence value per survey. Thus, the surveys have positive 
(0, 4], negative [-4, 0), and neutral {0} valence values. 
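
The shift-and-average step for emotion surveys can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: the ratings in EXAMPLE_RATINGS are hypothetical placeholders rather than values from Warriner’s lexicon [24], and the function names are introduced here for illustration only.

```python
from statistics import mean

# Hypothetical placeholder ratings on a 1-9 scale (5 = neutral).
# These are NOT the actual values from Warriner's lexicon [24].
EXAMPLE_RATINGS = {"enjoyment": 7.0, "hope": 7.0, "boredom": 3.0}


def survey_valence(emotions, ratings=EXAMPLE_RATINGS):
    """Return one valence in [-4, 4] for the emotions selected in one survey.

    Returns None for an empty selection (an assumption; the paper does not
    say how empty surveys are handled).
    """
    if not emotions:
        return None
    # Shift the 1..9 scale to -4..4 so that neutral sits at 0.
    shifted = [ratings[e] - 5.0 for e in emotions]
    # Multiple selected emotions are averaged into a single value.
    return mean(shifted)


def polarity(valence):
    """Label a valence as positive (0, 4], negative [-4, 0), or neutral {0}."""
    if valence > 0:
        return "positive"
    if valence < 0:
        return "negative"
    return "neutral"
```

With these placeholder ratings, a survey selecting both enjoyment and boredom averages to 0 and is therefore labeled neutral.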
Figure 1: Histogram of implicit, explicit, and combined
sequence lengths (up to a sequence length of 25)


SAM Surveys: A total of 5 SAM surveys, using a 5-point 
scale, were conducted in this MOOC. The SAM score repre- 
sented in Table 1 ranges from 1 to 5 with 1 being the least 
and 5 being the highest state of pleasure. As the distribu- 
tion of the number of students corresponding to each SAM 
score is normal, we convert this scale to an interval scale in 
the range [-4, 4] linearly. In total, 5,363 students have
submitted 9,512 SAM surveys, with the rest of the details shown
in Table 1.
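
One way to write the linear conversion of a 5-point SAM score to [-4, 4] is shown below. The function name and parameters are hypothetical; the only facts taken from the text are the 1 to 5 input scale, the [-4, 4] target range, and the linearity of the mapping.

```python
def rescale_sam(score, low=1, high=5, target=4.0):
    """Linearly map a SAM score in [low, high] onto [-target, target]."""
    if not low <= score <= high:
        raise ValueError("SAM score outside the expected scale")
    midpoint = (low + high) / 2    # 3 on a 5-point scale
    half_width = (high - low) / 2  # 2 on a 5-point scale
    # Center the score at 0, then stretch it to the target half-width.
    return (score - midpoint) / half_width * target
```

On the 5-point scale this reduces to 2 * (score - 3), so scores 1, 3, and 5 map to -4, 0, and 4.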


3.3 Implicit Emotions

The discussion forum is a platform that students use to
interact with each other, the instructor, and the teaching
assistant of the MOOC. In total, 1,717 students generated 5,322
discussion forum records. The posts, comments, and replies
(i.e., records) on the discussion forum are used to infer the
implicit emotions of students.


We use Warriner’s word-affect lexicon [24] to calculate the
valence values of discussion forum records. In [12], the
tokenized words in tweets are used to calculate the mean valence
value of the tweet using Warriner’s word-affect lexicon. We use a
similar approach to calculate valence values for discussion
forum records using the following steps: (i) Tokenize the
records to get a list of words, (ii) Remove the stop words
from the list, (iii) Make a list v of valence values associated
with each word using the lexicon, if present, after re-scaling
them to [-4, 4], (iv) Multiply the valence values of
words/phrases that follow a negative word (e.g., not, never)
by -1, and (v) Return the average valence value of list v.
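
The five steps above can be sketched as follows. Several details are assumptions made for illustration: the stop-word list, the negation list, and the lexicon entries are hypothetical placeholders (not Warriner’s actual ratings); negation is applied only to the single token that immediately follows a negative word; and a record with no lexicon words returns None.

```python
import re

# Hypothetical placeholders; not the lists or ratings used in the paper.
STOP_WORDS = {"the", "a", "is", "this", "i", "am"}
NEGATIONS = {"not", "never"}
EXAMPLE_LEXICON = {"happy": 8.0, "confused": 3.0}  # 1-9 scale, 5 = neutral


def record_valence(text, lexicon=EXAMPLE_LEXICON):
    """Mean valence in [-4, 4] of one forum record, or None if no word matches."""
    tokens = re.findall(r"[a-z']+", text.lower())         # (i) tokenize
    tokens = [t for t in tokens if t not in STOP_WORDS]   # (ii) drop stop words
    valences = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True                                 # flips the next token
            continue
        if token in lexicon:
            value = lexicon[token] - 5.0                  # (iii) rescale to [-4, 4]
            valences.append(-value if negate else value)  # (iv) apply negation
        negate = False                                    # scope: one token only
    if not valences:
        return None
    return sum(valences) / len(valences)                  # (v) average
```

Note that negation words must stay out of the stop-word list, otherwise step (ii) would remove them before step (iv) can use them.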


3.4 Combined Emotions 

Throughout the course, students have multiple opportuni- 
ties, explicit or implicit, to express their emotions. The 12 
emotion surveys, 5 SAM surveys, and valence values cal- 
culated from discussion forum records were interleaved and 
ordered chronologically for each student to form a combined 
sequence of valence values. 
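
A minimal sketch of the interleaving step, assuming each report is stored as a (timestamp, valence) pair; this representation and the function name are illustrative assumptions rather than the authors’ data format.

```python
def combined_sequence(explicit, implicit):
    """Merge explicit and implicit reports into one chronological valence list.

    Each argument is a list of (timestamp, valence) pairs for one student.
    """
    # sorted() is stable, so reports with equal timestamps keep the order
    # explicit-then-implicit; how ties are resolved is an assumption.
    merged = sorted(explicit + implicit, key=lambda report: report[0])
    return [valence for _, valence in merged]
```

For example, explicit reports at times 1 and 5 and an implicit report at time 3 yield a sequence ordered as explicit, implicit, explicit.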


A histogram of the number of reports corresponding to the
number of students in Figure 1 shows that the largest
share of students (14%) has a combined sequence
length of 3, with the number of students tapering
down after that point. The maximum number of reports
corresponding to a student is 74, as this student was very
active in the discussion forum.


To mitigate the spontaneous nature of emotions, we calcu- 
late the positivity of students at each report from the valence 
sequence values. Thus, if a student reports one negative 
emotion among a string of positive emotions, the impact of 
the negative emotion is reduced because of the previously 
expressed positive emotions. We define positivity as follows. 


Positivity: Let $r_1, r_2, \ldots, r_n$ be the reports made by a
student until element $n$ such that
$\mathrm{timestamp}(r_{i-1}) < \mathrm{timestamp}(r_i)$ for all $i$.
The valences are normalized between [-1, 1], instead of [-4, 4],
by dividing them by 4. Let $p_1, p_2, \ldots, p_m$ be the positive
normalized valences where $m \le n$ and $m + 1 > n$. The positivity
at the $n$th element is given by $(p_1 + p_2 + \cdots + p_m)/n$.


In other words, an element of the positivity sequence is cal- 
culated by averaging over only the positive valences in the 
sequence until that element. Since students have reported 
more positive than negative valences both explicitly and im- 
plicitly, calculating negativity instead of positivity would 
lead to extremely sparse sequences. 
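
The positivity definition can be sketched as a running computation over a chronologically ordered valence sequence. The function name is hypothetical; the logic follows the formula above, in which only positive normalized valences enter the sum while the divisor counts all reports so far.

```python
def positivity_sequence(valences):
    """Return the positivity at every element of a valence sequence in [-4, 4]."""
    sequence = []
    positive_sum = 0.0
    for n, valence in enumerate(valences, start=1):
        normalized = valence / 4.0       # [-4, 4] -> [-1, 1]
        if normalized > 0:
            positive_sum += normalized   # negative and neutral reports add 0
        sequence.append(positive_sum / n)
    return sequence
```

For the valences 4, -4, 2, the positivity sequence is 1.0, 0.5, 0.5: the negative report lowers positivity through the growing divisor rather than by subtracting from the sum.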


4. ANALYSIS 
4.1 Calculated Valences 


Section 3.3 lists the steps to calculate the valence values 
of the discussion forum records. To validate these valence 
values, 440 samples of the discussion forum records were 
manually annotated by three human raters in which each 
rater chooses one, two, or none of the 15 emotion choices 
that students had for their emotion surveys. The fourth 
rater is the calculated valence. We use Fleiss’ Kappa [2] 
to calculate the inter-rater agreement by converting the va- 
lence scores to positive, negative, or zero valence. The inter- 
rater agreement of the three human raters is 0.457 (moder- 
ate agreement), whereas the inter-rater agreement of the 
four raters including the calculated valences is 0.218 (fair 
agreement) [23]. While the agreement including the calcu- 
lated valences is lower, it is adequate, and so we use the 
calculated valence of these discussion forum records as the 
implicit valence values. 
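
For reference, Fleiss’ Kappa over sign categories can be computed as in the sketch below. It is a plain implementation of the standard formula, not the authors’ code; the category labels and the helper sign_category are names introduced here for illustration.

```python
def sign_category(valence):
    """Convert a valence score to the category used for agreement."""
    if valence > 0:
        return "positive"
    if valence < 0:
        return "negative"
    return "zero"


def fleiss_kappa(ratings, categories=("positive", "negative", "zero")):
    """Fleiss' Kappa for a list of items, each a list with one label per rater.

    Every item must be rated by the same number of raters (at least two).
    Undefined (division by zero) when all labels fall in a single category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j]: number of raters assigning item i to category j
    counts = [[item.count(c) for c in categories] for item in ratings]
    # Observed agreement, averaged over items
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_item) / n_items
    # Chance agreement from the overall category proportions
    totals = [sum(row[j] for row in counts) for j in range(len(categories))]
    expected = sum((t / (n_items * n_raters)) ** 2 for t in totals)
    return (observed - expected) / (1 - expected)
```

In the setting described above, each item would carry three labels for the human-only agreement and four labels once the calculated valence is added as a rater.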


4.2 Implicit vs. Explicit features (RQ1) 

Both implicit and explicit sequences are instances of irreg- 
ular time-series data. However, since emotion data is spon- 
taneous and might change multiple times between consec- 
utive reports [15], averaging, downsampling, interpolating 
or duplicating valence values in an emotion sequence might 
misrepresent the true emotional trajectory of the student. 


4.2.1 Feature vectors description 

Since the valence sequences are not uniform in length, we
create fixed length feature vectors for analysis. The features
are used in Sections 4.2.2 and 4.2.3 and are described below:
(i) pos: ratio of the number of positive valences to the total
length of the sequence
(ii) neg: ratio of the number of negative valences to the total
length of the sequence
(iii) neu: ratio of the number of neutral valences to the total
length of the sequence
(iv) trans: ratio of the number of transitions of valences from
positive to negative or vice versa in the sequence to the
sequence length
(v) pos_neg: ratio of the number of transitions of valences from
positive to negative to the sequence length
(vi) neg_pos: ratio of the number of transitions of valences from
negative to positive to the sequence length
(vii) range: calculated by subtracting the minimum valence value
from the maximum valence value expressed (to normalize the
value, the resulting range is divided by 8, as the valence
values lie in the range [-4, 4])
(viii) seq_len: length of the valence sequence (integral value).


Table 2: Corr. between implicit & explicit features

Features | Pearson’s r | Spearman’s rho
pos | 0.0401 | 0.0696**
neg | 0.0413* | 0.102***
neu | 0.0150 | 0.0380
seq_len | 0.346*** | 0.422***
trans | 0.125*** | 0.162***
neg_pos | 0.113*** | 0.165***
pos_neg | 0.0805** | 0.127***
range | 0.243*** | 0.257***

*:p-val.<0.1, **:p-val.<0.05, ***:p-val.<0.0001
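
The feature definitions of Section 4.2.1 can be sketched as a single function over one valence sequence. One detail is an assumption: a transition is counted only between two consecutive reports, so a neutral report between a positive and a negative one breaks the transition. The function names are illustrative.

```python
def sign(valence):
    """Return 1, -1, or 0 for positive, negative, or neutral valence."""
    return (valence > 0) - (valence < 0)


def sequence_features(valences):
    """Fixed-length features for a non-empty valence sequence in [-4, 4]."""
    n = len(valences)
    signs = [sign(v) for v in valences]
    pairs = list(zip(signs, signs[1:]))  # consecutive reports only
    pos_neg = sum(1 for a, b in pairs if a > 0 and b < 0)
    neg_pos = sum(1 for a, b in pairs if a < 0 and b > 0)
    return {
        "pos": signs.count(1) / n,
        "neg": signs.count(-1) / n,
        "neu": signs.count(0) / n,
        "trans": (pos_neg + neg_pos) / n,
        "pos_neg": pos_neg / n,
        "neg_pos": neg_pos / n,
        "range": (max(valences) - min(valences)) / 8,  # width of [-4, 4] is 8
        "seq_len": n,
    }
```

For the sequence 2, -1, 0, 3, -4 this gives two positive-to-negative transitions, no negative-to-positive transitions, and a normalized range of 0.875.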


4.2.2 Correlation 

In Table 2, we see that pos, neg, and neu, as defined in 
Section 4.2.1, between implicit and explicit emotions of stu- 
dents are not correlated with each other. This shows that 
both types of sequences are somewhat independent of each 
other and might show different insights into students’ affect. 
There are relatively few neutral discussion forum records,
which is why the correlation of neu with completion is not
significant. For the same reason, transitions from neutral to
positive and negative valences, and vice versa, have been left
out of the features list. The sequence lengths seem to be mildly
correlated, showing that students reporting more emotions in
the emotion surveys were also more likely to submit more
records in the discussion forum. This correlation is expected
since the number of students with larger sequence lengths
decreases, as seen from Figure 1.
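
The two coefficients reported in Table 2 can be computed per feature by pairing each student’s implicit value with the same student’s explicit value. The sketch below uses only the standard definitions (Spearman’s rho as Pearson’s r on average ranks); it does not compute p-values, and the function names are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r for two equal-length lists with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Average ranks matter here because ratio features such as pos and neg take the same value for many students, producing ties.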


4.2.3 Clustering of Feature Vectors 

We cluster the 7-dimensional feature vector to identify groups 
of similar students using K-Means. To visualize the clus- 
ters created, we decompose the 7-dimensional feature vec- 
tors of students’ implicit and explicit emotion sequences to 
a 2-dimensional space using Principal Component Analysis 
(PCA) separately. The PCA decomposition in Figure 2
shows clearly separable clusters in the 2-dimensional space.
The explicit clusters have significantly different ratios of
course completion: orange: 37.2%, purple: 25.5%, olive:
51.9%. Similarly, the completion ratios of the implicit
clusters are: red: 34.5%, blue: 32.6%, green: 60.3%, with the
green cluster having significantly more students completing
the course than the other two.


4.3 Combined sequence features (RQ2) 

From the previous subsection, we saw that implicit and
explicit sequences are not identical and should both be
incorporated into a student’s valence trajectory. So we use both
implicit and explicit sources of emotions ordered by time
to generate a combined valence sequence for students. The
features from Section 4.2.1 are used in the analysis below.


Figure 2: PCA decomposition of explicit (top) and
implicit (bottom) seq. clusters (’x’: cluster centers).
Axes: component1 vs. component2. Cluster sizes: explicit,
457, 615, and 557 students; implicit, 87, 1242, and 300 students.


4.3.1 Correlation of features with completion 

We generate the 7-dimensional feature vector from the combined
valence sequence for each student as defined in Section 4.2.1
and show the correlation of each dimension with completion
in Table 3. Completion is defined by a student reaching
module 8 [1]. We see that seq_len has the highest correlation
with completion, possibly because sequence length could
act as a proxy for the amount of time students spent in the
course. A similar reasoning might hold for trans. The pos,
neg, and neu features do not seem to be correlated with
completion. However, neg_pos seems to be better correlated with
completion than pos_neg. This supports our intuition that
students transitioning from a negative to a positive emotional
state are more likely to stay in the course, compared to the
other way round. The feature range is better correlated with
completion than trans, which indicates that a higher intensity
of changes in emotions is more likely to result in completion.


Table 3: Corr. of combined vectors with completion

Table 4: Corr. of features with quiz performance

Figure 3: Positivity Clustering of Combined Seqs.


4.3.2 Correlation of features with Quiz Performance 
The performance score of students for a quiz is normalized
between 0 and 1. The average, minimum, and maximum
performance scores of the quizzes (total 4) that students have
attempted are used as the y-variable for correlation. The
features that are significantly correlated with these
statistics using Pearson’s correlation are in Table 4. While the
negative correlation with seq_len is unsurprising given that
harder quizzes are towards the end of the course, the positive
correlation with range suggests that students who experience
extreme emotions tend to perform better.


4.4 Positivity clustering (RQ2) 

We compare fixed-length positivity sequences by clustering
the first 10 elements of the sequences of the 767 students who
have a sequence length of at least 10. We see that k=3 is the
highest number of clusters that shows no overlap of cluster
centers. While there is no significant difference between the
clusters for quiz performance, the difference between clusters
in terms of quiz participation using ANOVA is significant at
p-value < 0.05. Specifically, in the k=3 chart in Figure 3,
there are more students in the most positive (green) cluster
who do not submit a single quiz (29.3%) than in the other two
clusters (20%). A possible explanation is that students had
trouble with the quizzes and the ones who did not attempt them
were more likely to be happier. All three cluster centers
converge towards a narrow range of positivity, suggesting that
students tend towards the same positivity in the course even
though they started out differently.
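
The selection of fixed-length inputs for clustering can be sketched as below, assuming positivity sequences are held in a dictionary keyed by a student identifier. Both the data layout and the function name are illustrative assumptions; the clustering step itself is not shown.

```python
def fixed_length_prefixes(sequences, length=10):
    """Keep students with at least `length` reports and truncate to `length`.

    `sequences` maps a student identifier to that student's positivity sequence.
    """
    return {
        student: sequence[:length]
        for student, sequence in sequences.items()
        if len(sequence) >= length
    }
```

Truncating rather than padding keeps every clustered vector made of observed values only, at the cost of excluding students with shorter sequences.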


5. DISCUSSION AND FUTURE WORK 
Similar to the studies [14, 26, 29], we found that the self- 
reported emotions did not reflect the implicitly measured 
emotions. Clustering students by their emotion sequences
yielded clusters with different ratios of students who
completed the course. This observation is similar to what [14] found
about different learning strategies and activity of students. 
To investigate whether the temporally proximal self-report 
was correlated with the outcome completion, we measured 
the correlation of the last reported valence and the final pos- 
itivity in the students’ sequences with completion. However, 
similar to [29], we found no correlation. This suggests that 
the proximity of students’ emotions to the outcome comple- 
tion does not have a bearing on completion. 


Through RQ1, we show that both the implicit and explicit 
emotion sequences are independent of each other and con- 
tribute different emotional information. Through RQ2, we 
showed that students tend to converge towards the same 
positivity even though they start out differently, indicating 
that they end up feeling the same way. This might be be- 
cause of external factors that remained constant for all the 
students, e.g., how the course was conducted, possibly ex- 
plaining the lack of correlation with the course outcomes. 
We see significant differences between these clusters in quiz 
participation but not in other learning outcomes. This may 
be because students who did not attempt the quizzes did not 
struggle through the course and remained relatively happy. 
Our results show that there is potential for identifying dif- 
ferent groups of students that participate in a MOOC. Table 
2 shows that the explicit and implicit sequences are associ- 
ated with behavior, but not valence. One of the possible 
reasons is that students who participate more in the discus- 
sion forum tend to submit more surveys as well but the two 
types of sequences do not corroborate each other in valence. 
From Table 3, we also observe that students who feel neg- 
atively about the course and then transition to a positive 
emotional state are more likely to stay in the course. We 
found that the range of valence that students experience is 
more indicative of their course completion and quiz perfor- 
mance possibly because the students who struggle through 
the course report higher valence values after achieving their 
course objectives, resulting in their highly varied emotions. 


A limitation of our work is our sentiment analysis technique 
that uses a bag-of-words model with the discussion forum 
records only and does not consider other implicit measures 
of emotions. In this work, we have only relied on a single 
word-affect lexicon. However, we can make the calculated 
valence values more stable by triangulating the valences with 
other lexicons. We would also like to improve granularity 
and quantify the extra information conveyed by either type 
of emotion sequence. Even so, as most emotion research in
MOOCs relies on only one category of emotions, we conclude
that it might be advantageous for researchers in this area 
to supplement their current method with a method from 
the other category of emotions. It is important to continue
exploring emotions in MOOCs in pursuit of goals such as the
personalization of MOOCs, improving the emotional well-being
of students, and the design of MOOCs.


Proceedings of The 12th International Conference on Educational Data Mining (EDM 2019), p. 436